Contents

Chapter 5        RNA-Seq Data Analysis        163

5.1 INTRODUCTION TO RNA-SEQ        163
5.2 RNA-SEQ APPLICATIONS        165
5.3 RNA-SEQ DATA ANALYSIS WORKFLOW        166
5.3.1 Acquiring RNA-Seq Data        166
5.3.2 Read Mapping        167
5.3.3 Alignment Quality Assessment        171
5.3.4 Quantification        172
5.3.5 Normalization        174
5.3.5.1 RPKM and FPKM        174
5.3.5.2 Transcripts per Million        175
5.3.5.3 Counts per Million Mapped Reads        175
5.3.5.4 Trimmed Mean of M-values        175
5.3.5.5 Relative Expression        176
5.3.5.6 Upper Quartile        176
5.3.6 Differential Expression Analysis        176
5.3.7 Using EdgeR for Differential Analysis        180
5.3.7.1 Data Preparation        181
5.3.7.2 Annotation        183
5.3.7.3 Design Matrix        184
5.3.7.4 Filtering Low-Expressed Genes        185
5.3.7.5 Normalization        186
5.3.7.6 Estimating Dispersions        186
5.3.7.7 Exploring the Data        189
5.3.7.8 Model Fitting        194
5.3.7.9 Ontology and Pathways        202
5.3.8 Visualizing RNA-Seq Data        204
5.3.8.1 Visualizing Distribution with Boxplots        206
5.3.8.2 Scatter Plot        207
5.3.8.3 Mean-Average Plot (MA Plot)        208
5.3.8.4 Volcano Plots        209
5.4 SUMMARY        209
REFERENCES        211